VOILA: An Optimised Dialogue System for Interactively Learning Visually-Grounded Word Meanings (Demonstration System)
Authors
Abstract
We present VOILA, an optimised, multimodal dialogue agent for interactive learning of visually grounded word meanings from a human user. VOILA is: (1) able to learn new visual categories from scratch through interaction with users; (2) trained on real human-human dialogues in the same domain, and so able to conduct natural, spontaneous dialogue; (3) optimised to find the most effective trade-off between the accuracy of the visual categories it learns and the cost it imposes on users. VOILA is deployed on Furhat, a humanlike, multimodal robot head with back-projection of the face, and a graphical virtual character.
Similar resources
Interactively Learning Visually Grounded Word Meanings from a Human Tutor
We present a multi-modal dialogue system for interactive learning of perceptually grounded word meanings from a human tutor. The system integrates an incremental, semantic parsing/generation framework Dynamic Syntax and Type Theory with Records (DS-TTR) with a set of visual classifiers that are learned throughout the interaction and which ground the meaning representations that it produces. We ...
Learning how to Learn: An Adaptive Dialogue Agent for Incrementally Learning Visually Grounded Word Meanings
We present an optimised multi-modal dialogue agent for interactive learning of visually grounded word meanings from a human tutor, trained on real human-human tutoring data. Within a life-long interactive learning period, the agent, trained using Reinforcement Learning (RL), must be able to handle natural conversations with human users, and achieve good learning performance (i.e. accuracy) whil...
The BURCHAK corpus: a Challenge Data Set for Interactive Learning of Visually Grounded Word Meanings
We motivate and describe a new freely available human-human dialogue data set for interactive learning of visually grounded word meanings through ostensive definition by a tutor to a learner. The data has been collected using a novel, character-by-character variant of the DiET chat tool (Healey et al., 2003; Mills and Healey, submitted) with a novel task, where a Learner needs to learn invented...
Training an adaptive dialogue policy for interactive learning of visually grounded word meanings
We present a multi-modal dialogue system for interactive learning of perceptually grounded word meanings from a human tutor. The system integrates an incremental, semantic parsing/generation framework Dynamic Syntax and Type Theory with Records (DS-TTR) with a set of visual classifiers that are learned throughout the interaction and which ground the meaning representations that it produces. We ...
Incremental Generation of Visually Grounded Language in Situated Dialogue (demonstration system)
We present a multi-modal dialogue system for interactive learning of perceptually grounded word meanings from a human tutor (Yu et al., ). The system integrates an incremental, semantic, and bidirectional grammar framework, Dynamic Syntax and Type Theory with Records (DS-TTR; Eshghi et al., 2012; Kempson et al., 2001), with a set of visual classifiers that are learned throughout the intera...